We prove violation of the strong converse property of the product state capacity for a class of 
quantum channels with long-term memory. In obtaining this result, we establish upper and lower 
bounds on the one-shot strong converse rate of an arbitrary classical-quantum channel, which in turn 
directly yields bounds on the one-shot strong converse rate for transmission of classical information 
through an arbitrary quantum channel. In contrast, it is known that the product state capacity of a 
memoryless quantum channel satisfies the strong converse property. This result is retrieved from our 
one-shot bounds in the asymptotic limit. The expression for the strong converse rate of an arbitrary 
sequence of classical-quantum channels, which was originally derived by Hayashi and Nagaoka, can 
also be obtained from our one-shot results. Our bounds on the one-shot strong converse rates are 
given in terms of a generalized relative entropy quantity, namely, the max-relative entropy. We 
prove that this quantity also characterizes the strong converse rate of asymmetric hypothesis testing 
in the one-shot scenario. 

I. INTRODUCTION 

Transmission of information through noisy channels is an essential requirement in various information-processing 
tasks. A channel can be characterized by its capacity, which quantifies the maximum amount of information that 
can be transmitted reliably per use of the channel. If the sender (Alice) encodes information at a rate less than the 
capacity, then the receiver (Bob) can recover the information with a probability of error which vanishes asymptotically 
in the number of uses of the channel. For rates above the capacity, the asymptotic probability of error is bounded 
away from zero. Another quantity of interest characterizing a channel is its strong converse rate, which is the rate 
threshold above which information transmission fails with certainty, in the sense that the asymptotic probability of 
error is equal to one. 

Wolfowitz proved that, for a memoryless classical channel, i.e., a classical channel for which there are no 
correlations in the noise acting on successive inputs, the strong converse rate is equal to the capacity. This is referred 
to as the strong converse property (see, e.g., Q). The capacity of the channel hence provides a sharp threshold on 
its information-carrying power. Moreover, he proved that if the rate $R$ is greater than the capacity $C$ of the channel, then for 
any encoding-decoding scheme of rate $R$, the probability of successful decoding by Bob decays exponentially in the 
number ($n$) of uses of the channel, the probability being exponentially small in the difference $n(R - C)$. 

The strong converse property is also satisfied by the product state classical capacity of a memoryless quantum 
channel, that is, its capacity evaluated under the restriction that the inputs are product states. This result was 
proved independently by Ogawa and Nagaoka Q, and by Winter Q. Recently, König and Wehner @ proved the 
strong converse property of the classical capacity of a quantum channel (evaluated for general inputs) for a class of 
quantum channels for which the Holevo capacity is additive. These include all unital qubit channels, the $d$-dimensional 
depolarizing channel and the Werner-Holevo channel. 

This brings us to the following interesting question. Does the strong converse property hold for a channel which 
is not memoryless? Using the powerful Information Spectrum method (see, e.g., [10], [11], [17] and references therein), 
Verdú and Han [10] proved a necessary and sufficient condition for the validity of the strong converse property for an 
arbitrary sequence of classical channels. It is expressed in terms of the equality of two quantities characterizing the 
sequence of channels, called the sup- and inf-information rates [10], which are defined in the Information Spectrum 
framework. An analogous result was established for an arbitrary sequence of classical-quantum (c-q) channels by 
Hayashi and Nagaoka using the Quantum Information Spectrum Method [11]. However, even though these results are 
very general and elegant, they are not useful for verifying the validity of the strong converse property for any given 
channel with memory. This is because it is difficult to explicitly compute the values of the sup- and inf-information 
rates for any given channel with memory. 

In this paper we explicitly prove violation of the strong converse property for the product state capacity of a class 
of quantum channels with long-term memory. A channel in this class is given by a convex combination of a finite 
number of memoryless quantum channels. The product state capacity refers to the optimal rate of transmission of 
classical information, under the restriction of product state inputs, evaluated in the limit of asymptotically many 
uses of the channel. We compute the strong converse rate of the channel and show that it is strictly larger than the 
capacity of the channel. 

In establishing this result, we prove upper and lower bounds on the strong converse rate for a single use of an 
arbitrary classical-quantum channel, a result which is interesting in its own right. This in turn directly yields bounds 
on the one-shot strong converse rate for transmission of classical information through an arbitrary quantum channel. 
These bounds are expressed in terms of a generalized relative entropy quantity, namely, the max-relative entropy 
defined in [12]. For a memoryless quantum channel, in the limit of asymptotically many uses, these bounds converge 
independently to the known expression of the product state capacity of the channel given by its Holevo capacity 
([13, 14]; see also [11]), thus yielding the strong converse property. This also holds true for a memoryless c-q channel. 
Moreover, for an arbitrary sequence of c-q channels our one-shot bounds converge to the expression of the strong 
converse rate obtained by Hayashi and Nagaoka [11] in the Information Spectrum framework. 

We also prove that the max-relative entropy, which is the entropic quantity characterizing the one-shot strong 
converse rate of an arbitrary c-q channel, also characterizes the strong converse rate of asymmetric hypothesis testing 
between two quantum states (say $\rho$ and $\sigma$) in the one-shot setting. In this setting, only a single 
copy of the quantum state is given, and one needs to test the null hypothesis $\rho$ versus the alternative hypothesis $\sigma$. 
This is done by using a POVM $\{\Pi, I - \Pi\}$, where $\Pi$ corresponds to the acceptance of $\rho$, and $(I - \Pi)$ to the acceptance 
of $\sigma$. The errors of the first and second kind are then respectively defined as $\alpha(\Pi) := \mathrm{Tr}[(I - \Pi)\rho]$ and $\beta(\Pi) := \mathrm{Tr}[\Pi\sigma]$, 
where $\alpha(\Pi)$ is the probability of accepting $\sigma$ when $\rho$ is true, while $\beta(\Pi)$ is the probability of accepting $\rho$ when $\sigma$ is 
true. For any given $\varepsilon > 0$, the one-shot strong converse rate is roughly defined as the minimum value of $-\log\beta(\Pi)$ 
for which $\alpha(\Pi) \geq 1 - \varepsilon$ for all POVMs $\{\Pi, I - \Pi\}$, and we establish upper and lower bounds on it in terms of the 
max-relative entropy of $\rho$ with respect to $\sigma$. If instead one is given multiple identical copies of the quantum state, 
i.e., either $\rho^{\otimes n}$ or $\sigma^{\otimes n}$, then in the asymptotic setting ($n \to \infty$) our bounds on the one-shot strong converse rate 
converge to the quantum relative entropy of $\rho$ with respect to $\sigma$, and one retrieves the result first proved by Ogawa and 
Nagaoka [16]. This result further strengthens the connection between asymmetric hypothesis testing and transmission 
of classical information through a quantum channel [ll|, E^| • 

Our long-term memory channel can be considered as the quantum analogue of an averaged channel, introduced by 
Jacobs d , which was the first example of a classical channel exhibiting a violation of the strong converse property. A 
natural generalization of an averaged channel is a compound channel. In this case, an arbitrary set S of memoryless 
channels are provided but the probability of information being transmitted through the different channels in the set 
is not known. The sender and receiver merely know that the actual channel belongs to the set S. For a classical 
compound channel, Wolfowitz [1] proved that the strong converse property holds if the maximum probability of error 
is used in the definition of the capacity. The same holds for the classical capacity of a quantum compound channel, as 
proved by Bjelakovic and Boche [f|. In contrast, Ahlswede Q gave an example which demonstrated that the strong 
converse property need not hold for classical compound channels if the average probability of error is used in the 
definition of the capacity. 

The paper is organized as follows. In Section II we introduce the necessary notations and definitions. In Theorem 
11 of Section III we state upper and lower bounds on the strong converse rate for a single use of an arbitrary 
c-q channel. These in turn directly yield bounds on the one-shot strong converse rate for transmission of classical 
information through an arbitrary quantum channel, which are stated in Theorem 13. The proof of Theorem 11 is 
given in Section VII. In Section IV we introduce a class of c-q channels with long-term memory, and explicitly 
compute the strong converse rate in the limit of asymptotically many uses of a channel in this class. We prove that 
the strong converse rate is strictly larger than the capacity of the channel, the latter being obtained directly from [15]. 
Hence the capacity of the channel violates the strong converse property. In Section V we recover the validity of the 
strong converse property of the product state capacity of a memoryless quantum channel from our one-shot results. 
In Section VI we establish the strong converse rate for an arbitrary sequence of c-q channels, thus recovering the 
result given in [11]. In Section VIII we prove bounds on the strong converse rate of the hypothesis testing problem in 
the one-shot scenario and show how these lead to the known result [16] in the asymptotic setting. 

II. NOTATIONS AND DEFINITIONS 

Let $\mathcal{B}(\mathcal{H})$ denote the algebra of linear operators acting on a finite-dimensional Hilbert space $\mathcal{H}$, and let $\mathcal{D}(\mathcal{H}) \subset \mathcal{B}(\mathcal{H})$ 
denote the set of positive operators of unit trace (states). A quantum channel is given by a completely positive 
trace-preserving (CPTP) map $\mathcal{N} : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B)$, where $\mathcal{H}_A$ and $\mathcal{H}_B$ are the input and output Hilbert spaces of the 
channel. Throughout this paper, we restrict our considerations to finite-dimensional Hilbert spaces. 

The trace distance between two operators $A$ and $B$ is given by 

$$\|A - B\|_1 := \mathrm{Tr}[\{A \geq B\}(A - B)] - \mathrm{Tr}[\{A < B\}(A - B)],$$

where $\{A \geq B\}$ denotes the projector on the subspace where the operator $(A - B)$ is non-negative, and $\{A < B\} := I - \{A \geq B\}$. We make use of the following lemmas. 

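Lemma 1 \P\ l For self-adjoint operators $A$ and $B$, and any $0 \leq P \leq I$, the following inequalities hold. 

$$\mathrm{Tr}[P(A - B)] \leq \mathrm{Tr}[\{A \geq B\}(A - B)], \tag{1}$$
$$\mathrm{Tr}[P(A - B)] \geq \mathrm{Tr}[\{A < B\}(A - B)]. \tag{2}$$

Lemma 2 fl$i ] Given a state $\rho$ and a self-adjoint operator $\sigma$, for any real $\gamma$, we have 

$$\mathrm{Tr}[\{\rho \geq 2^{\gamma}\sigma\}\sigma] \leq 2^{-\gamma}. \tag{3}$$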
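
As a quick numerical sanity check of Lemma 2, the following sketch restricts to the commuting case, which is an assumption made purely for illustration: $\rho = \mathrm{diag}(p)$ for a probability vector $p$, and $\sigma = \mathrm{diag}(s)$ with real entries. The projector $\{\rho \geq 2^{\gamma}\sigma\}$ then simply selects the indices $i$ with $p_i \geq 2^{\gamma}s_i$. The function name `lemma2_lhs` is hypothetical and not part of the original text.

```python
import random


def lemma2_lhs(p, s, gamma):
    """Tr[{rho >= 2^gamma sigma} sigma] for commuting rho = diag(p), sigma = diag(s)."""
    # The projector keeps exactly those indices where p_i - 2^gamma * s_i >= 0.
    return sum(si for pi, si in zip(p, s) if pi >= 2.0 ** gamma * si)


# Randomized check of inequality (3) on diagonal examples.
rng = random.Random(0)
for _ in range(1000):
    d = rng.randint(2, 6)
    raw = [rng.random() for _ in range(d)]
    total = sum(raw)
    p = [r / total for r in raw]                 # a probability vector (diagonal state)
    s = [rng.uniform(-1.0, 1.0) for _ in range(d)]  # a real diagonal (self-adjoint) operator
    gamma = rng.uniform(-3.0, 3.0)
    assert lemma2_lhs(p, s, gamma) <= 2.0 ** (-gamma) + 1e-12
```

In this diagonal setting the bound is immediate: every selected index with $s_i > 0$ satisfies $s_i \leq 2^{-\gamma}p_i$, and indices with $s_i \leq 0$ only decrease the sum.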

The following entropic quantities are used in this paper, the logarithms in their definitions being taken to base 2. 
For any state $\rho$ and positive operator $\sigma$, the quantum relative entropy is defined as 

$$S(\rho\|\sigma) := \begin{cases} \mathrm{Tr}(\rho\log\rho - \rho\log\sigma), & \text{if } \mathrm{supp}\,\rho \subseteq \mathrm{supp}\,\sigma, \\ +\infty, & \text{else.} \end{cases} \tag{4}$$

The von Neumann entropy of a state $\rho$ is given by $S(\rho) = -\mathrm{Tr}(\rho\log\rho)$. We also employ the max-relative entropy 
[12] which is defined as follows. 

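Definition 3 The max-relative entropy of a state $\rho$ and a positive operator $\sigma$ is defined as 

$$D_{\max}(\rho\|\sigma) := \log\min\{\lambda : \rho \leq \lambda\sigma\}. \tag{5}$$

For any $0 < \varepsilon < 1$, the $\varepsilon$-smooth max-relative entropy of two states $\rho$ and $\sigma$ is defined as 

$$D^{\varepsilon}_{\max}(\rho\|\sigma) := \min_{\bar\rho \in B^{\varepsilon}(\rho)} D_{\max}(\bar\rho\|\sigma) = \min_{\bar\rho \in B^{\varepsilon}(\rho)} \log\min\{\lambda : \bar\rho \leq \lambda\sigma\}, \tag{6}$$

where $B^{\varepsilon}(\rho) := \{\bar\rho \geq 0 : \|\bar\rho - \rho\|_1 \leq \varepsilon,\ \mathrm{Tr}\,\bar\rho \leq \mathrm{Tr}\,\rho\}$. 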
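
To make the unsmoothed definition (5) concrete, here is a minimal sketch for the special case of commuting states, an assumption made for illustration only. For $\rho = \mathrm{diag}(p)$ and $\sigma = \mathrm{diag}(q)$, the operator inequality $\rho \leq \lambda\sigma$ reduces to $p_i \leq \lambda q_i$ for all $i$, so $D_{\max}$ becomes the logarithm of the largest likelihood ratio. The function names are hypothetical, and the smoothing in (6) is not implemented here.

```python
import math


def d_max_classical(p, q):
    """D_max(rho||sigma) in bits for commuting rho = diag(p), sigma = diag(q).

    Assumes p is a probability vector (at least one positive entry).
    """
    ratio = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # no finite lambda satisfies p_i <= lambda * q_i
            ratio = max(ratio, pi / qi)
    return math.log2(ratio)


def relative_entropy_classical(p, q):
    """S(rho||sigma) in bits for the same diagonal states (cf. Eq. (4))."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi > 0:
            if qi == 0:
                return math.inf  # support condition of Eq. (4) violated
            total += pi * math.log2(pi / qi)
    return total


p = [0.5, 0.5]
q = [0.25, 0.75]
# The max-relative entropy upper-bounds the ordinary relative entropy.
assert relative_entropy_classical(p, q) <= d_max_classical(p, q)
```

The final assertion illustrates the general ordering $S(\rho\|\sigma) \leq D_{\max}(\rho\|\sigma)$ in this diagonal example.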

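The $\varepsilon$-smooth max-relative entropy is quasiconvex. This follows from the quasiconvexity of the max-relative entropy 
(Lemma 9 of [12]), as is proved below. 

Lemma 4 For any $0 < \varepsilon < 1$, 

$$D^{\varepsilon}_{\max}\Big(\sum_i \gamma_i\rho_i \Big\| \sum_i \gamma_i\sigma_i\Big) \leq \max_i D^{\varepsilon}_{\max}(\rho_i\|\sigma_i), \tag{7}$$

where for each $i$, $\gamma_i \geq 0$, $\rho_i$, $\sigma_i$ are states, and $\sum_i \gamma_i = 1$. 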
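Proof. We have 

$$D^{\varepsilon}_{\max}\Big(\sum_i \gamma_i\rho_i \Big\| \sum_i \gamma_i\sigma_i\Big) = \min_{\bar\rho \in B^{\varepsilon}(\sum_i \gamma_i\rho_i)} D_{\max}\Big(\bar\rho \Big\| \sum_i \gamma_i\sigma_i\Big) \leq D_{\max}\Big(\bar\rho \Big\| \sum_i \gamma_i\sigma_i\Big) \tag{8}$$

for some $\bar\rho \in B^{\varepsilon}(\sum_i \gamma_i\rho_i)$ specified in what follows. Choose $\bar\rho = \sum_i \gamma_i\nu_i$, with $\nu_i \in B^{\varepsilon}(\rho_i)$ such that 

$$D^{\varepsilon}_{\max}(\rho_i\|\sigma_i) = D_{\max}(\nu_i\|\sigma_i) \quad \forall i. \tag{9}$$

We can verify that 

$$\Big\|\bar\rho - \sum_i \gamma_i\rho_i\Big\|_1 = \Big\|\sum_i \gamma_i(\nu_i - \rho_i)\Big\|_1 \leq \sum_i \gamma_i\|\nu_i - \rho_i\|_1 \leq \sum_i \gamma_i\varepsilon = \varepsilon. \tag{10}$$

For this $\bar\rho$, and continuing from (8), we have 

$$D_{\max}\Big(\bar\rho \Big\| \sum_i \gamma_i\sigma_i\Big) \leq \max_i D_{\max}(\nu_i\|\sigma_i) = \max_i D^{\varepsilon}_{\max}(\rho_i\|\sigma_i), \tag{11}$$

where the first inequality follows from the quasiconvexity of the max-relative entropy (Lemma 9 of [12]) and the 
equality follows from the choice of $\nu_i$ in (9). 
■ 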

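Lemma 5 For a classical-quantum state $\rho_{XB} = \sum_{x\in\mathcal{X}} p_x |x\rangle\langle x| \otimes \rho_B^x$, an arbitrary state $\sigma_B$, and $\rho_X = \mathrm{Tr}_B\,\rho_{XB}$, 
we have 

$$D_{\max}(\rho_{XB}\|\rho_X \otimes \sigma_B) = \max_{x\in\mathcal{X}} D_{\max}(\rho_B^x\|\sigma_B). \tag{12}$$

Proof. Let us choose $\lambda = 2^{D_{\max}(\rho_{XB}\|\rho_X\otimes\sigma_B)}$. Then 

$$\rho_{XB} \leq \lambda(\rho_X \otimes \sigma_B). \tag{13}$$

Since the states $\rho_{XB}$ and $\rho_X \otimes \sigma_B$ can be expressed as 

$$\rho_{XB} = \sum_x |x\rangle\langle x| \otimes (p_x\rho_B^x) \quad \text{and} \quad \rho_X \otimes \sigma_B = \sum_x |x\rangle\langle x| \otimes (p_x\sigma_B),$$

respectively, it follows that (13) is satisfied if $\rho_B^x \leq \lambda\sigma_B$ $\forall x \in \mathcal{X}$; conversely, since both operators are block-diagonal in the basis $\{|x\rangle\}$, (13) implies that $\rho_B^x \leq \lambda\sigma_B$ for every $x$ with $p_x > 0$. Hence we conclude that 

$$\max_{x\in\mathcal{X}} D_{\max}(\rho_B^x\|\sigma_B) \leq \log\lambda = D_{\max}(\rho_{XB}\|\rho_X \otimes \sigma_B).$$

Conversely, let $\lambda' = 2^{\max_{x\in\mathcal{X}} D_{\max}(\rho_B^x\|\sigma_B)}$. It follows that 

$$\rho_B^x \leq \lambda'\sigma_B \quad \text{for any } x \in \mathcal{X},$$

and the following also holds: 

$$\sum_x p_x |x\rangle\langle x| \otimes \rho_B^x \leq \lambda' \sum_x p_x |x\rangle\langle x| \otimes \sigma_B.$$

Therefore, 

$$\max_{x\in\mathcal{X}} D_{\max}(\rho_B^x\|\sigma_B) = \log\lambda' \geq D_{\max}(\rho_{XB}\|\rho_X \otimes \sigma_B).$$
■ 
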
Lemma 6 J^j / Given a state $\rho_{AB}$, let $\rho_{A(B)} = \mathrm{Tr}_{B(A)}\,\rho_{AB}$. Then for any operator $\omega_A \geq 0$, 

$$\min_{\sigma_B} D(\rho_{AB}\|\omega_A \otimes \sigma_B) = D(\rho_{AB}\|\omega_A \otimes \rho_B). \tag{14}$$

Lemma 7 JTHIj Let $\rho$ and $\sigma$ be density operators. Then there exists an integer $\lambda \geq D^{\varepsilon'}_{\max}(\rho\|\sigma)$ for which $\varepsilon' = \sqrt{8\,\mathrm{Tr}[\{\rho > 2^{\lambda}\sigma\}\rho]}$. 

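We also need the following definition of the $\varepsilon$-operator-smooth min-relative entropy introduced in [19], [21]. 

Definition 8 For any $\varepsilon > 0$ and $\rho, \sigma \in \mathcal{D}(\mathcal{H})$, the $\varepsilon$-operator-smooth min-relative entropy is defined as 

$$D^{\varepsilon}_{\min}(\rho\|\sigma) := \max_{\substack{0 \leq Q \leq I \\ \mathrm{Tr}\,Q\rho \geq 1-\varepsilon}} (-\log\mathrm{Tr}\,Q\sigma). \tag{15}$$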
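
For intuition, definition (15) can be evaluated exactly when $\rho = \mathrm{diag}(p)$ and $\sigma = \mathrm{diag}(q)$ commute, which is an illustrative assumption and not the general quantum case. The optimal test $Q$ may then be taken diagonal with entries $t_i \in [0, 1]$, and minimizing $\sum_i q_i t_i$ subject to $\sum_i p_i t_i \geq 1 - \varepsilon$ is a fractional knapsack problem solved greedily in order of increasing ratio $q_i/p_i$ (the Neyman-Pearson ordering). The function name `d_min_eps_classical` is hypothetical.

```python
import math


def d_min_eps_classical(p, q, eps):
    """Operator-smooth min-relative entropy (15), in bits, for diagonal rho, sigma."""
    need = 1.0 - eps  # p-mass that the test must accept
    # Cheapest outcomes first: smallest q_i / p_i. Outcomes with p_i = 0 are never useful.
    items = sorted((qi / pi, pi, qi) for pi, qi in zip(p, q) if pi > 0)
    cost = 0.0  # accumulated Tr[Q sigma]
    for _, pi, qi in items:
        if need <= 0:
            break
        t = min(1.0, need / pi)  # fraction of this outcome included in the test
        cost += t * qi
        need -= t * pi
    return math.inf if cost == 0 else -math.log2(cost)


p = [0.5, 0.5]
q = [0.25, 0.75]
# Allowing more error in accepting rho can only increase the quantity.
assert d_min_eps_classical(p, q, 0.5) >= d_min_eps_classical(p, q, 0.1)
```

With $\varepsilon = 0.5$ in this example the optimal test accepts only the first outcome, so that $\mathrm{Tr}\,Q\rho = 0.5$ and $\mathrm{Tr}\,Q\sigma = 0.25$.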

III. ONE-SHOT STRONG CONVERSE RATE FOR C-Q CHANNELS 

A classical-quantum (c-q) channel is defined by a triple $(W, \mathcal{X}, \mathcal{H}_B)$, where $\mathcal{X}$ is a finite set (called the input 
alphabet), $\mathcal{H}_B$ is a Hilbert space, and $W$ maps elements of $\mathcal{X}$ into density operators on $\mathcal{H}_B$. Henceforth, we will denote 
the c-q channel simply by $W$. Elements of $\mathcal{X}$ are the possible inputs of the channel, and $\mathrm{ran}\,W := \{W(x)\}_{x\in\mathcal{X}}$ is 
the set of possible outputs of the channel. In order to use the channel for transmitting classical messages, Alice (the 
sender) has to assign codewords (i.e., elements of $\mathcal{X}$) to each of her messages. The codewords are sent to Bob (the 
receiver) through the channel. Bob then does a measurement (described by a POVM) on the outputs of the channel, 
in order to infer Alice's messages. 

A classical-quantum (c-q) code $\mathcal{C}(W)$ for the c-q channel $W$ is a triple $(M, \varphi, \Pi)$ which consists of the following: 

• Alice's encoding $\varphi$ that maps $\{1, 2, \ldots, M\}$ to a subset in $\mathcal{X}$. 

• Bob's decoding POVM $\Pi := \{\Pi_i\}_{i=1}^{M}$ acting on the channel outputs $\{W(\varphi(i))\}_{i=1}^{M}$. 

Here $1, 2, \ldots, M$ are the labels of Alice's messages, $\varphi(1), \ldots, \varphi(M)$ are the codewords, and $\Pi_1, \ldots, \Pi_M$ are the POVM 
elements used to discriminate the states $W(\varphi(1)), \ldots, W(\varphi(M))$. The number of codewords, $M$, is the size of the 
code. 

The average error probability of a c-q code $\mathcal{C}(W) := (M, \varphi, \Pi)$ is defined as follows. 

$$p_e(\mathcal{C}(W)) := \frac{1}{M}\sum_{i=1}^{M} \mathrm{Tr}[W(\varphi(i))(I - \Pi_i)]. \tag{16}$$
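
The following sketch evaluates (16) in a purely classical special case, assumed here only for illustration: the channel outputs $W(x)$ are commuting and are represented by the rows of a stochastic matrix, and each POVM element $\Pi_i$ is the projector onto a set of output symbols (a decision region). The regions must be disjoint so that the $\Pi_i$ sum to at most the identity. All names are hypothetical.

```python
def average_error(W, codewords, regions):
    """Average error probability (16) for a classical channel.

    W[x][y]   : probability of output y given input x (rows sum to one)
    codewords : codewords[i] is the input symbol phi(i) for message i
    regions   : regions[i] is the set of outputs decoded as message i (disjoint sets)
    """
    M = len(codewords)
    total = 0.0
    for x, region in zip(codewords, regions):
        success = sum(W[x][y] for y in region)  # Tr[W(phi(i)) Pi_i]
        total += 1.0 - success                  # Tr[W(phi(i)) (I - Pi_i)]
    return total / M


# A binary channel with asymmetric noise, used with the identity code.
W = [[0.9, 0.1],
     [0.2, 0.8]]
p_err = average_error(W, codewords=[0, 1], regions=[{0}, {1}])
```

Here each message is decoded from its own output symbol, so the two error terms are the off-diagonal entries of `W`, averaged over the two messages.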

For any given $\varepsilon > 0$, we define the one-shot $\varepsilon$-error capacity and the one-shot $\varepsilon$-error strong converse rate as follows: 

Definition 9 (one-shot $\varepsilon$-error capacity) For a given $\varepsilon > 0$, the one-shot $\varepsilon$-error capacity of a c-q channel $W$ is 
defined as follows: 

$$C^{(1)}_{\varepsilon}(W) := \sup\{\log M : \exists\, \mathcal{C}(W) := (M, \varphi, \Pi) \text{ s.t. } p_e(\mathcal{C}(W)) \leq \varepsilon\}. \tag{17}$$

Note that it denotes the maximum number of bits that can be transmitted through a single use of the channel with 
average error probability of at most $\varepsilon$. 

Definition 10 (one-shot $\varepsilon$-error strong converse rate) For a given $\varepsilon > 0$, the one-shot $\varepsilon$-error strong converse 
rate of a c-q channel $W$ is defined as follows: 

$$C^{*(1)}_{\varepsilon}(W) := \inf\{\log M : \forall\, \mathcal{C}(W) := (M, \varphi, \Pi),\ p_e(\mathcal{C}(W)) \geq 1 - \varepsilon\}. \tag{18}$$

One can interpret this quantity as being equal to the minimum number of bits which, if transmitted through a single 
use of the channel, yields an average error probability of at least $1 - \varepsilon$. 

The following theorem gives upper and lower bounds on the one-shot strong converse rate of a c-q channel $W$. It 
is proved in Section VII. 

Theorem 11 [One-Shot Strong Converse Rate of Classical-Quantum Channels] 

For any < e < 1 and e' > j^, the one-shot strong converse rate C* (W) of a c-q channel W satisfies the following 
bounds: 

maxZ^ (p XB \\p x ® p B ) - log .^.^ ' } < C* e ^{W) < m & xD^(p XB \\p x ® p B ) + log3/e (19) 
px (1 + e)e' — 2e px 
where $p_X = \{p_x\}_{x\in\mathcal{X}}$ denotes a probability distribution on the input alphabet $\mathcal{X}$. In the above, the state $\rho_{XB}$ has the 
following form: 

$$\rho_{XB} = \sum_{x\in\mathcal{X}} p_x |x\rangle\langle x| \otimes W(x), \tag{20}$$

where $\{|x\rangle\}_{x\in\mathcal{X}}$ forms an orthonormal basis in $\mathcal{H}_X$, $W(x) \in \mathcal{D}(\mathcal{H}_B)$ denotes the output of the c-q channel $W$ corresponding 
to the input $x \in \mathcal{X}$, and $\rho_{X(B)} = \mathrm{Tr}_{B(X)}\,\rho_{XB}$ denote its reduced states. 

Note that the bounds on the one-shot strong converse rate are given in terms of the smooth max-relative entropy (6). 
In contrast, Wang and Renner proved [19] that one can obtain bounds on the one-shot capacity of a c-q channel $W$ 
in terms of the operator-smooth min-relative entropy (15) as follows. 

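Theorem 12 [One-shot Capacity of Classical-Quantum Channels [19]] 

For any $0 < \varepsilon' < \varepsilon$, the one-shot $\varepsilon$-error channel capacity $C^{(1)}_{\varepsilon}(W)$ of a c-q channel $W : \mathcal{X} \to \mathcal{D}(\mathcal{H}_B)$ is bounded as 
follows: 

$$\max_{p_X} D^{\varepsilon'}_{\min}(\rho_{XB}\|\rho_X \otimes \rho_B) - \log\frac{2 + \varepsilon + \varepsilon^{-1}}{\varepsilon - (1+\varepsilon)\varepsilon'} \leq C^{(1)}_{\varepsilon}(W) \leq \max_{p_X} D^{\varepsilon}_{\min}(\rho_{XB}\|\rho_X \otimes \rho_B), \tag{21}$$

where $\rho_{XB}$ is defined by (20). 

See also [20] for alternative expressions for bounds on $C^{(1)}_{\varepsilon}(W)$. 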



A. One-Shot Strong Converse Rate for transmission of classical information through a quantum channel 

Our results on the one-shot strong converse rate for c-q channels directly yield bounds on the one-shot strong 
converse rate for transmission of classical information through a general quantum channel $\mathcal{N} : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B)$. For 
any $0 < \varepsilon < 1$, we denote the one-shot $\varepsilon$-error classical capacity of $\mathcal{N}$ as $C^{(1)}_{\varepsilon}(\mathcal{N})$, which is defined through (17), with 
the c-q channel $W$ replaced by $\mathcal{N}$. Similarly, we denote the one-shot $\varepsilon$-error strong converse rate as $C^{*(1)}_{\varepsilon}(\mathcal{N})$, which 
is defined analogously through (18). 

Theorem 11 directly yields the following upper and lower bounds on $C^{*(1)}_{\varepsilon}(\mathcal{N})$. This is because a c-q channel $W$ 
can be viewed as the composition of two CPTP maps, $W = \mathcal{N} \circ \mathcal{E}$, where $\mathcal{N}$ is the quantum channel, and $\mathcal{E}$ encodes 
the classical messages into quantum states which are then transmitted through $\mathcal{N}$. 

Theorem 13 [One-Shot Strong Converse Rate of a Quantum Channel] 

For any < e < 1 and e' > j^, the one-shot e-error classical strong converse rate C*^ (Af) of a quantum channel 
Af : B(Ha) t-t B(Hb) satisfies the following bounds: 

max (pxbWpx <8 Pb) - log ^ < C* e ^(Af) < max D^(p XB \\p x ® p B ) + lo g 3/e (22) 

where $\{p_x, \rho_x\}$ denotes an input ensemble to the channel, 

$$\rho_{XB} := \sum_{x\in\mathcal{X}} p_x |x\rangle\langle x| \otimes \mathcal{N}(\rho_x), \tag{23}$$

and $\rho_{X(B)} = \mathrm{Tr}_{B(X)}\,\rho_{XB}$ denote its reduced states. 

Correspondingly, upper and lower bounds on the one-shot $\varepsilon$-error classical capacity $C^{(1)}_{\varepsilon}(\mathcal{N})$, stated in Theorem 14 
below, follow directly from Theorem 12. 



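Theorem 14 [One-Shot Classical Capacity of a Quantum Channel [19]] 

For any $0 < \varepsilon' < \varepsilon$, the one-shot $\varepsilon$-error classical capacity $C^{(1)}_{\varepsilon}(\mathcal{N})$ of a quantum channel $\mathcal{N} : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B)$ is 
bounded as follows: 

$$\max_{\{p_x,\rho_x\}} D^{\varepsilon'}_{\min}(\rho_{XB}\|\rho_X \otimes \rho_B) - \log\frac{2 + \varepsilon + \varepsilon^{-1}}{\varepsilon - (1+\varepsilon)\varepsilon'} \leq C^{(1)}_{\varepsilon}(\mathcal{N}) \leq \max_{\{p_x,\rho_x\}} D^{\varepsilon}_{\min}(\rho_{XB}\|\rho_X \otimes \rho_B), \tag{24}$$

where $\rho_{XB}$ is defined by (23). 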

IV. VIOLATION OF THE STRONG CONVERSE PROPERTY IN A CLASS OF LONG-TERM MEMORY CHANNELS 


We consider a class of quantum channels with long-term memory, which are convex combinations of a finite number 
of memoryless quantum channels. For a channel in this class, $n$ successive uses are described by the map $\mathcal{N}^{(n)} : \mathcal{B}(\mathcal{H}_A^{\otimes n}) \to \mathcal{B}(\mathcal{H}_B^{\otimes n})$, 
and its action on any input state $\rho^{(n)} \in \mathcal{D}(\mathcal{H}_A^{\otimes n})$ is given as follows: 

$$\mathcal{N}^{(n)}(\rho^{(n)}) := \sum_{i=1}^{K} \gamma_i\, \phi_i^{\otimes n}(\rho^{(n)}), \tag{25}$$

where each $\phi_i : \mathcal{B}(\mathcal{H}_A) \to \mathcal{B}(\mathcal{H}_B)$, $i = 1, \ldots, K$, is a memoryless quantum channel, $\gamma_i > 0$ for each $i \in \{1, 2, \ldots, K\}$, 
and $\sum_{i=1}^{K} \gamma_i = 1$. In other words, the channel consists of $K$ memoryless branches $\phi_1, \phi_2, \ldots, \phi_K$. On using the 
channel, an initial random choice is made as to which of these memoryless branches the successive inputs are sent 
through, the inputs being sent through $\phi_i$ with probability $\gamma_i$. Note that if the first input is sent through $\phi_i$ then all 
successive inputs are also sent through it. Hence the channel has long-term memory. For convenience we denote the 
channel simply as $\mathcal{N}$. 

Suppose Alice has a set of classical messages, labelled by elements of the set $\mathcal{M}_n := \{1, 2, \ldots, M_n\}$, which she 
would like to communicate to Bob via such a channel. To do this, she encodes each message into a quantum state 
$\rho^{(n)} \in \mathcal{D}(\mathcal{H}_A^{\otimes n})$ which she then sends to Bob through $n$ uses of the channel. 

This long-term memory quantum channel was studied in [15]. The authors evaluated a single-letter expression for 
the classical capacity of the channel under the restriction of product state inputs, i.e., the so-called product state 
capacity, in the limit of asymptotically many uses of the channel, which is defined as follows. 

If there exists an $N \in \mathbb{N}$ such that for all $n \geq N$ there exists a sequence of codes $\{\mathcal{C}^{(n)}(\mathcal{N}^{(n)}) := (M^{(n)}, \varphi^{(n)}, \Pi^{(n)})\}_{n=1}^{\infty}$, 
of sizes $M^{(n)} \geq 2^{nR}$, for which the probability of error $p_e(\mathcal{C}^{(n)}(\mathcal{N}^{(n)})) \to 0$ as $n \to \infty$, then $R$ 
is said to be an achievable rate. The product state capacity $C_p(\mathcal{N})$ of the channel $\mathcal{N}$ is defined as the supremum of 
all achievable rates, under the restriction that the inputs to the channel are product states. 

The following theorem was proved in [15]. 

Theorem 15 [15] The capacity of the long-term memory channel $\mathcal{N}$ defined through (25) is given by 

$$C_p(\mathcal{N}) := \max_{\{p_x,\rho_x\}} \min_{1 \leq j \leq K} \chi(\{p_x, \phi_j(\rho_x)\}), \tag{26}$$

where $\chi(\{p_x, \phi_j(\rho_x)\})$ denotes the Holevo $\chi$-quantity of the ensemble $\{p_x, \phi_j(\rho_x)\}$ of quantum states, 

$$\chi(\{p_x, \phi_j(\rho_x)\}) := S\Big(\sum_x p_x\phi_j(\rho_x)\Big) - \sum_x p_x S(\phi_j(\rho_x)), \tag{27}$$

with $\{p_x, \rho_x\}$ denoting an input ensemble of quantum states. 

Similarly, the corresponding strong converse rate $C^*_p(\mathcal{N})$ is defined as the infimum of $R$ such that for any sequence of 
codes $\{\mathcal{C}^{(n)}(\mathcal{N}^{(n)})\}_{n=1}^{\infty}$ of sizes $M^{(n)} \geq 2^{nR}$, $p_e(\mathcal{C}^{(n)}(\mathcal{N}^{(n)})) \to 1$ as $n \to \infty$, once again the inputs being restricted to 
product states. The strong converse rate $C^*_p(\mathcal{N})$ is obtainable from the one-shot strong converse rate of the channels 
$\mathcal{N}^{(n)}$ as follows. 

$$C^*_p(\mathcal{N}) := \lim_{\varepsilon\to 0}\limsup_{n\to\infty} \frac{1}{n} C^{*(1)}_{\varepsilon}(\mathcal{N}^{(n)}). \tag{28}$$

We prove that the strong converse rate for the long-term memory quantum channel $\mathcal{N}$ (25) is given by the following 
theorem. 

Theorem 16 The strong converse rate of the long-term memory channel $\mathcal{N}$, defined through (25), is given by 

$$C^*_p(\mathcal{N}) := \max_{\{p_x,\rho_x\}} \max_{1 \leq i \leq K} \chi(\{p_x, \phi_i(\rho_x)\}) = \max_{1 \leq i \leq K} \chi^*(\phi_i), \tag{29}$$

where $\chi(\{p_x, \phi_i(\rho_x)\})$ is the Holevo $\chi$-quantity defined through (27) and 

$$\chi^*(\phi_i) := \max_{\{p_x,\rho_x\}} \chi(\{p_x, \phi_i(\rho_x)\}) \tag{30}$$

is the Holevo capacity of the memoryless channel $\phi_i$. 

Note that $C^*_p(\mathcal{N}) > C_p(\mathcal{N})$ and hence the channel exhibits a violation of the strong converse property for transmission 
of classical information, under the restriction of product state inputs. 
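
The gap between (26) and (29) can be illustrated numerically in a classical special case. The sketch below is built on assumptions that are not in the original text: each branch is a binary symmetric channel, inputs are restricted to classical binary distributions, and the Holevo quantity is replaced by the Shannon mutual information, to which it reduces for commuting outputs. The maximization over input distributions is done by a simple grid search, and all function names are hypothetical.

```python
import math


def h2(t):
    """Binary entropy in bits."""
    return 0.0 if t in (0.0, 1.0) else -t * math.log2(t) - (1 - t) * math.log2(1 - t)


def entropy(dist):
    return -sum(t * math.log2(t) for t in dist if t > 0)


def mutual_information(p0, W):
    """I(X;Y) for a binary-input channel W[x][y] with input distribution (p0, 1 - p0)."""
    px = [p0, 1.0 - p0]
    py = [sum(px[x] * W[x][y] for x in range(2)) for y in range(len(W[0]))]
    return entropy(py) - sum(px[x] * entropy(W[x]) for x in range(2))


def bsc(f):
    """Binary symmetric channel with crossover probability f."""
    return [[1 - f, f], [f, 1 - f]]


def capacity_and_strong_converse(branches, steps=1000):
    grid = [k / steps for k in range(steps + 1)]
    # Classical analogue of (26): the worst branch limits reliable communication.
    cap = max(min(mutual_information(p0, W) for W in branches) for p0 in grid)
    # Classical analogue of (29): the best branch sets the strong converse rate.
    sc = max(max(mutual_information(p0, W) for p0 in grid) for W in branches)
    return cap, sc


cap, sc = capacity_and_strong_converse([bsc(0.1), bsc(0.3)])
print(f"capacity ~ {cap:.4f} bits, strong converse rate ~ {sc:.4f} bits")
```

Note that the branch weights $\gamma_i$ do not enter either formula, provided they are strictly positive. For rates between the two values, the success probability neither tends to one nor to zero, since decoding can still succeed whenever the better branch is selected.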
Proof of $C^*_p(\mathcal{N}) \leq \max_{1 \leq i \leq K} \chi^*(\phi_i)$. 

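Since we consider transmission of classical information under the restriction of product state inputs, from the upper 
bound in Theorem 13 we have 

$$C^{*(1)}_{\varepsilon}(\mathcal{N}^{(n)}) \leq \max_{\{p_x,\rho_x\}} D^{\varepsilon/3}_{\max}(\rho_{X^nB^n}\|\rho_{X^n} \otimes \rho_{B^n}) + \log\frac{3}{\varepsilon}, \tag{31}$$

where 

$$\rho_{X^nB^n} = \sum_{i=1}^{K} \gamma_i\, \rho^{i}_{X^nB^n}, \qquad \rho_{B^n} = \sum_{i=1}^{K} \gamma_i\, \rho^{i}_{B^n},$$

with 

$$\rho^{i}_{X^nB^n} := \Big(\sum_{x\in\mathcal{X}} p_x |x\rangle\langle x| \otimes \phi_i(\rho_x)\Big)^{\otimes n} = (\rho^{i}_{XB})^{\otimes n} \tag{32}$$

and 

$$\rho_{X^n} := \Big(\sum_{x} p_x |x\rangle\langle x|\Big)^{\otimes n} = (\rho_X)^{\otimes n}, \qquad \rho^{i}_{B^n} := \Big(\sum_{x} p_x \phi_i(\rho_x)\Big)^{\otimes n} = (\rho^{i}_B)^{\otimes n}. \tag{33}$$

Then from (31) and the quasi-convexity of the smooth max-relative entropy (Lemma 4) it follows that 

$$\begin{aligned} C^{*(1)}_{\varepsilon}(\mathcal{N}^{(n)}) &\leq \max_{\{p_x,\rho_x\}} \max_{1 \leq i \leq K} D^{\varepsilon/3}_{\max}(\rho^{i}_{X^nB^n}\|\rho_{X^n} \otimes \rho^{i}_{B^n}) + \log\frac{3}{\varepsilon} \\ &= \max_{\{p_x,\rho_x\}} \max_{1 \leq i \leq K} D^{\varepsilon/3}_{\max}\big((\rho^{i}_{XB})^{\otimes n}\|(\rho_X \otimes \rho^{i}_B)^{\otimes n}\big) + \log\frac{3}{\varepsilon}. \end{aligned} \tag{34}$$
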

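The above bound, and the definition of the strong converse rate, together imply that 

$$\begin{aligned} C^*_p(\mathcal{N}) &\leq \max_{\{p_x,\rho_x\}} \max_{1 \leq i \leq K} \lim_{\varepsilon\to 0}\limsup_{n\to\infty} \frac{1}{n} D^{\varepsilon/3}_{\max}\big((\rho^{i}_{XB})^{\otimes n}\|(\rho_X \otimes \rho^{i}_B)^{\otimes n}\big) \\ &= \max_{\{p_x,\rho_x\}} \max_{1 \leq i \leq K} S(\rho^{i}_{XB}\|\rho_X \otimes \rho^{i}_B) \\ &\leq \max_{\{p_x,\rho_x\}} \max_{1 \leq i \leq K} \sum_x p_x S(\phi_i(\rho_x)\|\rho^{i}_B) \\ &= \max_{\{p_x,\rho_x\}} \max_{1 \leq i \leq K} \chi(\{p_x, \phi_i(\rho_x)\}) \\ &= \max_{1 \leq i \leq K} \chi^*(\phi_i). \end{aligned} \tag{35}$$
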

We obtain the second line of (35) by using the identity 

$$\lim_{\varepsilon\to 0}\limsup_{n\to\infty} \frac{1}{n} D^{\varepsilon}_{\max}(\rho^{\otimes n}\|\sigma^{\otimes n}) = S(\rho\|\sigma), \tag{36}$$

which follows from Lemma 19 and (46). The third line of (35) follows from the joint convexity of the relative entropy 
and the second-last line follows from the definition of the Holevo $\chi$-quantity (27). ■ 

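Proof of $C^*_p(\mathcal{N}) \geq \max_{1 \leq i \leq K} \chi^*(\phi_i)$. 

To prove that $C^*_p(\mathcal{N}) \geq \max_{1 \leq i \leq K} \chi^*(\phi_i)$ it suffices to prove that for any $0 < R < \max_{1 \leq i \leq K} \chi^*(\phi_i)$, there exists 
a sequence of codes $\{\mathcal{C}^{(n)}(\mathcal{N}^{(n)})\}_{n=1}^{\infty}$ such that 

$$p_e(\mathcal{C}^{(n)}(\mathcal{N}^{(n)})) \not\to 1 \quad \text{as } n \to \infty. \tag{37}$$

In the above $\mathcal{N}^{(n)}$ denotes $n$ successive uses of the long-term memory channel $\mathcal{N}$ and is defined through (25). 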
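
Without loss of generality, let us assume that the $j$-th memoryless branch $\phi_j$ of the long-term memory channel $\mathcal{N}$ 
satisfies 

$$\chi^*(\phi_j) = \max_{1 \leq i \leq K} \chi^*(\phi_i), \tag{38}$$

where $\chi^*(\phi_i)$ is defined by (30). Then it follows from the HSW theorem ([13, 14]) that for any $R < \chi^*(\phi_j)$ there 
exists a sequence of codes $\{\mathcal{C}^{(n)}(\phi_j^{\otimes n}) := (M^{(n)}, \varphi^{(n)}, \Pi^{(n)})\}$, with $M^{(n)} = \lfloor 2^{nR} \rfloor$ and codeword states $\rho^{(n)}_i$, such that 

$$\lim_{n\to\infty} p_e(\mathcal{C}^{(n)}(\phi_j^{\otimes n})) = \lim_{n\to\infty} \frac{1}{M^{(n)}}\sum_{i=1}^{M^{(n)}} \mathrm{Tr}\big[(I - \Pi^{(n)}_i)\,\phi_j^{\otimes n}(\rho^{(n)}_i)\big] = 0.$$

Since Alice's messages are sent through the $j$-th branch with probability $\gamma_j$, if Alice and Bob use the code $\mathcal{C}^{(n)}(\phi_j^{\otimes n})$ 
for transmission of classical information through the long-term memory channel $\mathcal{N}$, under the restriction of product 
state inputs, then the asymptotic probability of successful transmission satisfies 

$$\lim_{n\to\infty} \big(1 - p_e(\mathcal{C}^{(n)}(\mathcal{N}^{(n)}))\big) \geq \gamma_j > 0.$$

Consequently (37) holds. ■ 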



V. STRONG CONVERSE PROPERTY FOR MEMORYLESS CHANNEL 



It is known that the product state capacity of a memoryless quantum channel (as also the capacity of a memoryless 
c-q channel) satisfies the strong converse property @, 3. In this section, we show how this result can be obtained 
from our result on the one-shot strong converse rate (Theorem 13) and the corresponding result (Theorem 14) on the 
one-shot classical capacity of a quantum channel. 

For a memoryless quantum channel $\mathcal{N}$, the strong converse rate for product state inputs, and the product state 
capacity are respectively expressed in terms of the corresponding one-shot quantities as follows. 



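$$C^*_p(\mathcal{N}) = \lim_{\varepsilon\to 0}\limsup_{n\to\infty} \frac{1}{n} C^{*(1)}_{\varepsilon}(\mathcal{N}^{\otimes n}), \tag{39}$$

$$C_p(\mathcal{N}) = \lim_{\varepsilon\to 0}\liminf_{n\to\infty} \frac{1}{n} C^{(1)}_{\varepsilon}(\mathcal{N}^{\otimes n}). \tag{40}$$

For any input ensemble $\{p_x, \rho_x\}$ and a memoryless quantum channel $\mathcal{N}$, let 

$$\rho_{XB} = \sum_x p_x |x\rangle\langle x| \otimes \mathcal{N}(\rho_x), \quad \text{and} \quad \rho_{X^nB^n} = (\rho_{XB})^{\otimes n}. \tag{41}$$

Theorem 13 provides bounds on $C^{*(1)}_{\varepsilon}(\mathcal{N}^{\otimes n})$ in terms of the smooth max-relative entropy of $\rho_{X^nB^n}$ with respect to $\rho_{X^n} \otimes \rho_{B^n}$. Then from (39) and 
(36), it follows that 

$$C^*_p(\mathcal{N}) = \max_{\{p_x,\rho_x\}} S(\rho_{XB}\|\rho_X \otimes \rho_B) = \max_{\{p_x,\rho_x\}} \chi(\{p_x, \mathcal{N}(\rho_x)\}) = \chi^*(\mathcal{N}), \tag{42}$$

where $\chi^*(\mathcal{N})$ denotes the Holevo capacity of the channel $\mathcal{N}$. 

Similarly, Theorem 14 provides bounds on $C^{(1)}_{\varepsilon}(\mathcal{N}^{\otimes n})$ in terms of the operator-smooth min-relative entropy of $\rho_{X^nB^n}$ with respect to $\rho_{X^n} \otimes \rho_{B^n}$. From 
Quantum Stein's Lemma ([16], [22]; see also Lemma 2 in [19]), we have that 

$$\lim_{\varepsilon\to 0}\lim_{n\to\infty} \frac{1}{n} D^{\varepsilon}_{\min}(\rho^{\otimes n}\|\sigma^{\otimes n}) = S(\rho\|\sigma). \tag{43}$$

Then from (41) and (43) it follows that 

$$C_p(\mathcal{N}) = \chi^*(\mathcal{N}).$$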

Hence the strong converse property is satisfied for the product state capacity of a memoryless quantum channel. The 
proof of the strong converse property for a memoryless c-q channel follows analogously from Theorems 11 and 12. 

Further, for any real number $R > C_p(\mathcal{N})$ and any encoding-decoding scheme of rate $R$, the probability of successful 
decoding by Bob decays exponentially in the number ($n$) of uses of the channel, the probability being exponentially 
small in the difference n(R — C). This can be seen by a calculation similar to (1651) . the term 2 7 /M on the fifth line 
of (|65|) being replaced by the term 2~ n ( R ~ Cp W s> in this case. 



VI. STRONG CONVERSE RATE FOR AN ARBITRARY SEQUENCE OF C-Q CHANNELS 

Let $\mathcal{W} := \{W^{(n)}\}_{n=1}^{\infty}$ denote an arbitrary sequence of c-q channels, with $W^{(n)} : \mathcal{X}^n \to \mathcal{D}(\mathcal{H}_B^{\otimes n})$. In order to evaluate 
an expression for the strong converse rate of such a sequence of channels we use the formalism of the Information 
Spectrum Method [11], [17]. Two fundamental quantities used in this method are the so-called quantum spectral 
sup-(inf-) divergence rates, which are defined as follows [17]. 

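Definition 17 Given two sequences of states $\rho := \{\rho_n\}_{n=1}^{\infty}$ and $\sigma := \{\sigma_n\}_{n=1}^{\infty}$ on finite-dimensional Hilbert spaces 
$\mathcal{H}_n$, the quantum spectral sup-(inf-) divergence rates are defined in terms of the difference operators $\Pi_n(\gamma) = \rho_n - 2^{n\gamma}\sigma_n$ 
for any real number $\gamma$ as follows: 

$$\overline{D}(\rho\|\sigma) := \inf\Big\{\gamma : \lim_{n\to\infty} \mathrm{Tr}\big[\{\Pi_n(\gamma) \geq 0\}\Pi_n(\gamma)\big] = 0\Big\}, \tag{44}$$
$$\underline{D}(\rho\|\sigma) := \sup\Big\{\gamma : \lim_{n\to\infty} \mathrm{Tr}\big[\{\Pi_n(\gamma) \geq 0\}\Pi_n(\gamma)\big] = 1\Big\}. \tag{45}$$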

Note that the above definition is equivalent to the definition used in [11], as was proved in [17]. 

These divergence rates can be viewed as generalizations of the quantum relative entropy, defined by (4). In fact, for 
two sequences of states $\rho = \{\rho^{\otimes n}\}_{n=1}^{\infty}$ and $\sigma = \{\sigma^{\otimes n}\}_{n=1}^{\infty}$, it has been proved that [11] 

$$\underline{D}(\rho\|\sigma) = \overline{D}(\rho\|\sigma) = S(\rho\|\sigma), \tag{46}$$

which is an alternative formulation of the so-called Quantum Stein's Lemma [16], [22]. 

The strong converse rate of the sequence $\mathcal{W}$ of c-q channels is defined in terms of the one-shot strong converse rates 
of the channels as follows: 

$$C^*_{\infty}(\mathcal{W}) := \lim_{\varepsilon\to 0}\limsup_{n\to\infty} \frac{1}{n} C^{*(1)}_{\varepsilon}(W^{(n)}). \tag{47}$$

Our one-shot results (Theorem 11) on the strong converse rate directly yield the following expression for the strong 
converse rate for $\mathcal{W}$, which was first derived by Hayashi and Nagaoka [11]. 

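Theorem 18 The strong converse rate for a sequence of c-q channels $\mathcal{W} := \{W^{(n)}\}_{n=1}^{\infty}$, with $W^{(n)} : \mathcal{X}^n \to \mathcal{D}(\mathcal{H}_B^{\otimes n})$, 
is given by 

$$C^*_{\infty}(\mathcal{W}) = \max_{p_X} \chi(p_X, \mathcal{W}), \tag{48}$$

where $p_X = \{p_{X^n}\}_{n=1}^{\infty}$ denotes a sequence of probability distributions, with $p_{X^n} = \{p_{x^n}\}_{x^n\in\mathcal{X}^n}$, and 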

Wx,W) := inf J 7 : lim V p x „ Tr [{p x n > 2^p B J(p x n - 2"> B ,J] = 1 . (49) 

n— >oo — » 

The quantity $\chi(p_X, \mathcal{W})$ can be viewed as a generalization of the Holevo $\chi$-quantity since it reduces to it for a sequence 
of memoryless c-q channels. That is, if $\mathcal{W} := \{W^{\otimes n}\}_{n=1}^{\infty}$, with $W : \mathcal{X} \to \mathcal{D}(\mathcal{H}_B)$, and $p_X = \{p_x\}_{x\in\mathcal{X}}$ is a probability 
distribution on the input alphabet $\mathcal{X}$, then 

$$\chi(p_X, \mathcal{W}) = \chi(\{p_x, W(x)\}), \tag{50}$$

where 

$$\chi(\{p_x, W(x)\}) := S\Big(\sum_x p_x W(x)\Big) - \sum_x p_x S(W(x)) \tag{51}$$
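is the Holevo $\chi$-quantity of the ensemble of quantum states $\{p_x, W(x)\}$. This can be seen as follows. Note that $\overline{\chi}(\hat p_X, \hat W)$ is expressed in terms of a sup-spectral divergence rate

$$\overline{\chi}(\hat p_X, \hat W) = \overline{D}(\hat\rho_{XB}\|\hat\rho_X\otimes\hat\rho_B), \tag{52}$$

where $\hat\rho_{XB} = \{\rho_{X_nB_n}\}_{n=1}^{\infty}$, and

$$\rho_{X_nB_n} = \sum_{x^n} p_{x^n}\,|x^n\rangle\langle x^n|\otimes W^{(n)}(x^n), \tag{53}$$

with $W^{(n)}(x^n)$ being the output state of the $n$th channel in the sequence $\hat W$ when the input is $x^n \in \mathcal{X}^n$; $\hat\rho_X = \{\rho_{X_n}\}_{n=1}^{\infty}$, $\hat\rho_B = \{\rho_{B_n}\}_{n=1}^{\infty}$, with $\rho_{X_n}$ and $\rho_{B_n}$ being the reduced states of $\rho_{X_nB_n}$. Now for a sequence of memoryless channels $\hat W = \{W^{\otimes n}\}_{n=1}^{\infty}$ with $W : \mathcal{X} \to \mathcal{D}(\mathcal{H}_B)$, for each $n$, $\rho_{X_nB_n} = \rho_{XB}^{\otimes n}$, where

$$\rho_{XB} = \sum_x p_x\,|x\rangle\langle x|\otimes W(x), \tag{54}$$

with $\{p_x\}$ being a probability distribution on the input alphabet $\mathcal{X}$. Using this fact and the identity (46) we infer that in this case

$$\begin{aligned}
\overline{\chi}(\hat p_X, \hat W) &= S(\rho_{XB}\|\rho_X\otimes\rho_B) \\
&= \sum_x p_x\, S\Big(W(x)\,\Big\|\,\sum_{x'} p_{x'} W(x')\Big) \\
&= \chi(\{p_x, W(x)\}).
\end{aligned} \tag{55}$$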
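The chain of equalities in (55) can be checked numerically in the simplest setting. The sketch below is only an illustration under a strong simplifying assumption: all output states $W(x)$ are taken to be diagonal in a common basis, so that von Neumann entropies reduce to Shannon entropies. The input distribution and the channel outputs are hypothetical numbers chosen for the example, not data from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy in bits (the von Neumann entropy of a diagonal state)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def relative_entropy(p, q):
    """S(p||q) in bits for diagonal states with supp(p) contained in supp(q)."""
    return sum(a * math.log2(a / b) for a, b in zip(p, q) if a > 0)

# Hypothetical input distribution p_x and diagonal output states W(x).
p_x = [0.5, 0.3, 0.2]
W = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]

# Average output state rho_B = sum_x p_x W(x).
rho_B = [sum(p * w[b] for p, w in zip(p_x, W)) for b in range(2)]

# Eq. (51): chi = S(rho_B) - sum_x p_x S(W(x)).
chi = entropy(rho_B) - sum(p * entropy(w) for p, w in zip(p_x, W))

# Second line of (55): sum_x p_x S(W(x) || rho_B).
avg_rel = sum(p * relative_entropy(w, rho_B) for p, w in zip(p_x, W))

# First line of (55): S(rho_XB || rho_X (x) rho_B), evaluated on the joint diagonal.
joint = [p * w[b] for p, w in zip(p_x, W) for b in range(2)]
prod = [p * rho_B[b] for p in p_x for b in range(2)]
joint_rel = relative_entropy(joint, prod)

print(chi, avg_rel, joint_rel)
```

The three printed numbers agree up to floating-point error, which is the content of (55) in the commuting case.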

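To prove Theorem 18 we employ the following known result.

Lemma 19 Given a sequence of bipartite states $\hat\rho = \{\rho_n\}_{n=1}^{\infty}$, and a sequence of positive operators $\hat\sigma = \{\sigma_n\}_{n=1}^{\infty}$, where $\rho_n, \sigma_n \in \mathcal{B}(\mathcal{H}^{\otimes n})$, the sup-spectral divergence rate $\overline{D}(\hat\rho\|\hat\sigma)$, defined above, satisfies

$$\overline{D}(\hat\rho\|\hat\sigma) = \lim_{\varepsilon\to 0}\,\limsup_{n\to\infty}\,\frac{1}{n}\,D^{\varepsilon}_{\max}(\rho_n\|\sigma_n), \tag{56}$$

where $D^{\varepsilon}_{\max}(\rho_n\|\sigma_n)$ is the smooth max-relative entropy of the state $\rho_n$ of the sequence $\hat\rho$ and the operator $\sigma_n$ of the sequence $\hat\sigma$.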

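Proof. (Proof of Theorem 18) From the definition (47) of the strong converse rate of $\hat W$ and our one-shot results given in Theorem 11 it follows that

$$C^*_\infty(\hat W) = \lim_{\varepsilon\to 0}\,\limsup_{n\to\infty}\,\frac{1}{n}\,\max_{p_{X_n}} D^{\varepsilon}_{\max}(\rho_{X_nB_n}\|\rho_{X_n}\otimes\rho_{B_n}), \tag{57}$$

where $p_{X_n} := \{p_{x^n}\}_{x^n\in\mathcal{X}^n}$, and

$$\rho_{X_nB_n} := \sum_{x^n\in\mathcal{X}^n} p_{x^n}\,|x^n\rangle\langle x^n|\otimes W^{(n)}(x^n),$$

and $\rho_{X_n}$ ($\rho_{B_n}$) $:= \mathrm{Tr}_{B_n}$ ($\mathrm{Tr}_{X_n}$) $\rho_{X_nB_n}$. Then using Lemma 19 we obtain

$$C^*_\infty(\hat W) = \max_{\hat p_X}\,\overline{D}(\hat\rho_{XB}\|\hat\rho_X\otimes\hat\rho_B), \tag{58}$$

which together with (52) yields the desired result (48). ■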

VII. PROOF OF THEOREM 11
A. Proof of the upper bound in Theorem 11

Proof. We need to prove that for any real number $R > \max_{p_X} D^{\varepsilon/3}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B) + \log(3/\varepsilon)$, the probability of error $p_e(\mathcal{C}(W))$ of any c-q code $\mathcal{C}(W) := (M, \varphi, \Pi)$ of size $M = 2^R$ satisfies

$$p_e(\mathcal{C}(W)) > 1 - \varepsilon.$$

Let us choose the probability distribution $p_X := \{p_x\}$ to be uniform on the codebook, i.e., $p_x = \frac{1}{M}$ for $x$ in the codebook and $p_x = 0$ otherwise. For such a distribution

$$\rho_{XB} = \frac{1}{M}\sum_{x=1}^{M} |x\rangle\langle x|\otimes\rho_x,$$

where $\rho_x := W(x) \in \mathcal{D}(\mathcal{H}_B)$. By the definition of the quantity $D^{\varepsilon/3}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B)$, it follows that there exists an operator $\bar\rho_{XB} \in B^{\varepsilon/3}(\rho_{XB})$ such that

$$\bar\rho_{XB} \le 2^{D^{\varepsilon/3}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B)}\,(\rho_X\otimes\rho_B). \tag{59}$$

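Let $\{\pi_x := |x\rangle\langle x|\}_{x\in\mathcal{X}}$ be a complete set of rank-one projectors in $\mathcal{B}(\mathcal{H}_X)$. Define the map $\mathcal{P}_X(\rho) := \sum_x \pi_x\,\rho\,\pi_x$. Applying the map $(\mathcal{P}_X\otimes\mathrm{id}_B)$ on both sides of (59) gives

$$\tilde\rho_{XB} \le 2^{D^{\varepsilon/3}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B)}\,(\rho_X\otimes\rho_B), \tag{60}$$

where

$$\tilde\rho_{XB} = \sum_x (\pi_x\otimes I_B)\,\bar\rho_{XB}\,(\pi_x\otimes I_B) = \sum_x |x\rangle\langle x|\otimes\bar\rho_x.$$

Equivalently, we have, $\forall x \in \mathcal{X}$,

$$\bar\rho_x \le \frac{1}{M}\,2^{D^{\varepsilon/3}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B)}\,\rho_B \le 2^{\gamma - R}\rho_B, \tag{61}$$

where $\gamma := \max_{p_X} D^{\varepsilon/3}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B)$.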

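From the monotonicity of the trace distance under CPTP maps, we have

$$\|\rho_{XB} - \tilde\rho_{XB}\|_1 \le \|\rho_{XB} - \bar\rho_{XB}\|_1 \le \varepsilon/3,$$

and hence

$$\sum_x \Big\|\frac{1}{M}\rho_x - \bar\rho_x\Big\|_1 \le \varepsilon/3. \tag{62}$$

By Lemma 1 we have, for any $0 \le \Pi_x \le I$,

$$\begin{aligned}
\mathrm{Tr}[\Pi_x M\bar\rho_x] &\le \mathrm{Tr}\big[\{M\bar\rho_x \ge 2^{\gamma}\rho_B\}(M\bar\rho_x - 2^{\gamma}\rho_B)\big] + 2^{\gamma}\,\mathrm{Tr}[\Pi_x\rho_B] \\
&\le \mathrm{Tr}\big[\{M\bar\rho_x \ge 2^{\gamma}\rho_B\}M\bar\rho_x\big] + 2^{\gamma}\,\mathrm{Tr}[\Pi_x\rho_B].
\end{aligned} \tag{63}$$

Then

$$\mathrm{Tr}[(I - \Pi_x)M\bar\rho_x] \ge \mathrm{Tr}\big[\{M\bar\rho_x < 2^{\gamma}\rho_B\}M\bar\rho_x\big] - 2^{\gamma}\,\mathrm{Tr}[\Pi_x\rho_B]. \tag{64}$$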

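The average probability of error of any c-q code $\mathcal{C}(W) = (M, \varphi, \Pi)$ of size $M = 2^R$ with $R > \max_{p_X} D^{\varepsilon/3}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B) + \log(3/\varepsilon)$ is then given by

$$\begin{aligned}
p_e(\mathcal{C}(W)) &= \frac{1}{M}\sum_x \mathrm{Tr}[\rho_x(I - \Pi_x)] \\
&= \frac{1}{M}\sum_x \mathrm{Tr}[M\bar\rho_x(I - \Pi_x)] + \frac{1}{M}\sum_x \mathrm{Tr}[(\rho_x - M\bar\rho_x)(I - \Pi_x)] \\
&\ge \frac{1}{M}\sum_x \mathrm{Tr}[M\bar\rho_x(I - \Pi_x)] - \varepsilon/3 \\
&\ge \frac{1}{M}\sum_x \Big(\mathrm{Tr}\big[\{M\bar\rho_x < 2^{\gamma}\rho_B\}M\bar\rho_x\big] - 2^{\gamma}\,\mathrm{Tr}[\Pi_x\rho_B]\Big) - \varepsilon/3 \\
&= \frac{1}{M}\sum_x \mathrm{Tr}[M\bar\rho_x] - \frac{2^{\gamma}}{M} - \varepsilon/3 \\
&> (1 - \varepsilon/3) - \varepsilon/3 - \varepsilon/3 \\
&= 1 - \varepsilon.
\end{aligned} \tag{65}$$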
The first inequality in (65) holds because

$$\begin{aligned}
\sum_x \mathrm{Tr}[(\rho_x - M\bar\rho_x)(I - \Pi_x)] &\ge \sum_x \mathrm{Tr}\big[\{\rho_x < M\bar\rho_x\}(\rho_x - M\bar\rho_x)\big] \\
&= \sum_x \Big(\mathrm{Tr}\big[\{\rho_x \ge M\bar\rho_x\}(\rho_x - M\bar\rho_x)\big] - \|\rho_x - M\bar\rho_x\|_1\Big) \\
&\ge -M\varepsilon/3,
\end{aligned}$$

where the first inequality follows from Lemma 1, and the second inequality follows because from (62) we have that $\sum_x \|\rho_x - M\bar\rho_x\|_1 \le M\varepsilon/3$. The second inequality in (65) follows from (64), and the third inequality follows from the choice of $R$ and the fact that

$$\frac{1}{M}\sum_x \mathrm{Tr}[\rho_x - M\bar\rho_x] \le \frac{1}{M}\sum_x \|\rho_x - M\bar\rho_x\|_1 \le \varepsilon/3.$$
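The third inequality in (65) combines two small estimates, written out here as a worked step. This is a sketch under two notational assumptions that are not confirmed by the surviving text: logarithms are taken to base 2 (consistent with the powers of 2 used throughout), and $\bar\rho_x$ denotes the blocks of the smoothed operator appearing in (61).

```latex
% Choice of R controls the term 2^gamma / M (recall M = 2^R):
R > \gamma + \log\frac{3}{\varepsilon}
\;\Longrightarrow\;
\frac{2^{\gamma}}{M} = 2^{\gamma - R} < \frac{\varepsilon}{3}.

% Normalisation Tr[rho_x] = 1 plus the trace-norm estimate controls the first term:
\frac{1}{M}\sum_x \mathrm{Tr}[M\bar\rho_x]
= \frac{1}{M}\sum_x \mathrm{Tr}[\rho_x]
- \frac{1}{M}\sum_x \mathrm{Tr}[\rho_x - M\bar\rho_x]
\ge 1 - \frac{\varepsilon}{3}.
```

Subtracting the remaining $\varepsilon/3$ then gives the strict lower bound $1 - \varepsilon$.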



B. Proof of the lower bound in Theorem 11

To prove the lower bound in (19) we employ the following lemma, proved by Hayashi and Nagaoka [11]:

Lemma 20 Consider any c-q channel $W : \mathcal{X} \to \mathcal{D}(\mathcal{H}_B)$. For any $\lambda \in \mathbb{R}$, $M \in \mathbb{N}$ and $c > 0$, there exists a code $\mathcal{C}(W) = (M, \varphi, \Pi)$ such that

$$p_e(\mathcal{C}(W)) \le (1 + c)\Big(1 - \sum_x p_x\,\mathrm{Tr}\big[\{\rho_x \ge 2^{\lambda}\rho_B\}\rho_x\big]\Big) + (2 + c + c^{-1})\,2^{-\lambda}M,$$

where $\rho_x := W(x)$ and $\rho_B := \sum_x p_x\rho_x$.

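Note that proving the lower bound in (19) is equivalent to proving the following statement. There exists a code $\mathcal{C}(W) = (M, \varphi, \Pi)$ of size $M = \lfloor 2^R \rfloor$, with

$$R \ge D^{\sqrt{8\varepsilon'}}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B) - \log\Big(\frac{2 + \varepsilon + \varepsilon^{-1}}{(1 + \varepsilon)\varepsilon' - 2\varepsilon}\Big),$$

such that

$$p_e(\mathcal{C}(W)) \le 1 - \varepsilon.$$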

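Proof. Fix $\varepsilon > 0$ and a probability distribution $p_X = \{p_x\}_{x\in\mathcal{X}}$ in terms of which the state $\rho_{XB}$ is defined (see (20)). Choose $\varepsilon' > \frac{2\varepsilon}{1+\varepsilon}$. By Lemma 7, for any such $\varepsilon'$ there exists an integer $\lambda \ge D^{\sqrt{8\varepsilon'}}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B)$ for which

$$\sqrt{8\,\mathrm{Tr}\big[\{\rho_{XB} \ge 2^{\lambda}(\rho_X\otimes\rho_B)\}\rho_{XB}\big]} = \sqrt{8\sum_x p_x\,\mathrm{Tr}\big[\{\rho_x \ge 2^{\lambda}\rho_B\}\rho_x\big]} = \sqrt{8\varepsilon'}.$$

The above equation implies the existence of a POVM $\Pi = \{\Pi_x := \{\rho_x \ge 2^{\lambda}\rho_B\}\}$ such that

$$\sum_x p_x\,\mathrm{Tr}[\Pi_x\rho_x] = \varepsilon'.$$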

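Let $M = \lfloor 2^R \rfloor$, with $R \le \lambda - \log\Big(\frac{2 + \varepsilon + \varepsilon^{-1}}{(1 + \varepsilon)\varepsilon' - 2\varepsilon}\Big)$. Invoking Lemma 20 and choosing $c = \varepsilon$ in it, we infer that there exists a code $\mathcal{C}(W) = (M, \varphi, \Pi)$ such that

$$\begin{aligned}
p_e(\mathcal{C}(W)) &\le (1 + \varepsilon)\Big(1 - \sum_x p_x\,\mathrm{Tr}\big[\{\rho_x \ge 2^{\lambda}\rho_B\}\rho_x\big]\Big) + (2 + \varepsilon + \varepsilon^{-1})\,2^{-\lambda}M && (66) \\
&\le (1 + \varepsilon)(1 - \varepsilon') + (1 + \varepsilon)\varepsilon' - 2\varepsilon && (67) \\
&= 1 - \varepsilon. && (68)
\end{aligned}$$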
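The step from (66) to (68) is pure arithmetic once the code size is fixed, and it can be checked numerically. The sketch below is a hedged illustration: the function name `lemma20_bound` and the sample values of `eps`, `eps_prime` and `lam` are hypothetical, logarithms are assumed to be base 2, and the sum $\sum_x p_x \mathrm{Tr}[\{\rho_x \ge 2^{\lambda}\rho_B\}\rho_x]$ is simply set equal to $\varepsilon'$ as in the proof.

```python
import math

def lemma20_bound(eps, eps_prime, lam):
    """Evaluate the right-hand side of (66) for the largest rate allowed in the proof.

    Returns (rate, size, bound), where size = floor(2**rate) and bound is the
    resulting upper bound on the error probability.
    """
    slack = (1 + eps) * eps_prime - 2 * eps   # must be positive for the log to exist
    if slack <= 0:
        raise ValueError("(1 + eps) * eps_prime - 2 * eps must be positive")
    c_term = 2 + eps + 1 / eps                # (2 + c + 1/c) with c = eps
    rate = lam - math.log2(c_term / slack)    # largest admissible R
    size = math.floor(2 ** rate)              # code size M = floor(2^R)
    bound = (1 + eps) * (1 - eps_prime) + c_term * 2 ** (-lam) * size
    return rate, size, bound

eps = 0.1
rate, size, bound = lemma20_bound(eps=eps, eps_prime=0.5, lam=20)
print(rate, size, bound, 1 - eps)
```

For these sample values the computed bound stays just below $1 - \varepsilon$, matching (68); the small gap comes from rounding $2^R$ down to an integer.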

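This implies that

$$C^{*(1)}_{\varepsilon}(W) \ge \lambda - \log\Big(\frac{2 + \varepsilon + \varepsilon^{-1}}{(1 + \varepsilon)\varepsilon' - 2\varepsilon}\Big) \ge D^{\sqrt{8\varepsilon'}}_{\max}(\rho_{XB}\|\rho_X\otimes\rho_B) - \log\Big(\frac{2 + \varepsilon + \varepsilon^{-1}}{(1 + \varepsilon)\varepsilon' - 2\varepsilon}\Big).$$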
VIII. STRONG CONVERSE IN QUANTUM HYPOTHESIS TESTING

Consider the quantum hypothesis testing problem with the null hypothesis $H_0 : \rho$ versus the alternative hypothesis $H_1 : \sigma$. We can decide which hypothesis is true based on the POVM $\{\Pi, I - \Pi\}$, where $0 \le \Pi \le I$. For a test $\Pi$, the error probabilities of the first kind and the second kind are defined as

$$\alpha(\Pi) := \mathrm{Tr}[(I - \Pi)\rho], \tag{69}$$
$$\beta(\Pi) := \mathrm{Tr}[\Pi\sigma], \tag{70}$$

where $\alpha(\Pi)$ is the probability of accepting $\sigma$ when $\rho$ is true, while $\beta(\Pi)$ is the probability of accepting $\rho$ when $\sigma$ is true.
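The two error probabilities in (69) and (70) are easy to evaluate when everything commutes. The following sketch assumes, purely for illustration, that $\rho$, $\sigma$ and the test $\Pi$ are diagonal in a common basis and are represented by lists of their diagonal entries; the helper name `error_probabilities` and the numbers are hypothetical.

```python
def error_probabilities(rho, sigma, test):
    """Return (alpha, beta) for diagonal rho, sigma and a diagonal test 0 <= Pi <= I.

    alpha = Tr[(I - Pi) rho] and beta = Tr[Pi sigma]; each argument is the list
    of diagonal entries in a common basis.
    """
    alpha = sum((1 - t) * r for t, r in zip(test, rho))
    beta = sum(t * s for t, s in zip(test, sigma))
    return alpha, beta

# Hypothetical diagonal states (illustrative numbers only).
rho = [0.7, 0.2, 0.1]
sigma = [0.1, 0.3, 0.6]

trivial = error_probabilities(rho, sigma, [1, 1, 1])  # always accept rho
sharp = error_probabilities(rho, sigma, [1, 0, 0])    # accept rho only on outcome 0

print("always accept rho:", trivial)
print("accept on outcome 0:", sharp)
```

Shrinking the test lowers $\beta(\Pi)$ at the price of a larger $\alpha(\Pi)$; the strong converse rate quantifies how small $\beta(\Pi)$ can be pushed before $\alpha(\Pi)$ is forced close to one.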

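For any $0 < \varepsilon < 1$, we define the one-shot strong converse rate for hypothesis testing as follows:

$$\beta^{*(1)}_{\varepsilon} := \inf\{R : \forall\,\Pi,\ (-\log\beta(\Pi)) \ge R \implies \alpha(\Pi) \ge 1 - \varepsilon\}. \tag{71}$$

The following lemma provides an upper bound on $\beta^{*(1)}_{\varepsilon}$.

Lemma 21 For any $0 < \varepsilon < 1$, $\beta^{*(1)}_{\varepsilon} \le D^{\varepsilon/2}_{\max}(\rho\|\sigma) + \log(2/\varepsilon)$.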



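Proof. By the definition of $D^{\varepsilon/2}_{\max}(\rho\|\sigma)$, there exists an operator $\bar\rho \in B^{\varepsilon/2}(\rho)$ for which

$$\bar\rho \le 2^{D^{\varepsilon/2}_{\max}(\rho\|\sigma)}\,\sigma. \tag{72}$$

Hence for any $0 \le \Pi \le I$ for which

$$-\log\beta(\Pi) \ge D^{\varepsilon/2}_{\max}(\rho\|\sigma) + \log(2/\varepsilon), \tag{73}$$

we have

$$\begin{aligned}
\mathrm{Tr}(\Pi\bar\rho) &\le 2^{D^{\varepsilon/2}_{\max}(\rho\|\sigma)}\,\mathrm{Tr}(\Pi\sigma) \\
&= 2^{D^{\varepsilon/2}_{\max}(\rho\|\sigma)}\,\beta(\Pi) \\
&\le 2^{D^{\varepsilon/2}_{\max}(\rho\|\sigma)}\,2^{-[D^{\varepsilon/2}_{\max}(\rho\|\sigma) + \log(2/\varepsilon)]} \\
&= \varepsilon/2.
\end{aligned} \tag{74}$$

The first inequality follows from (72), and the second inequality follows from (73). Hence,

$$\begin{aligned}
\mathrm{Tr}(\Pi\rho) &= \mathrm{Tr}(\Pi\bar\rho) + \mathrm{Tr}(\Pi(\rho - \bar\rho)) \\
&\le \varepsilon/2 + \|\rho - \bar\rho\|_1 \\
&\le \varepsilon.
\end{aligned}$$

The first inequality follows from (74) and Lemma 1. The second inequality holds because $\bar\rho \in B^{\varepsilon/2}(\rho)$. Consequently,

$$\alpha(\Pi) = 1 - \mathrm{Tr}(\Pi\rho) \ge 1 - \varepsilon. \tag{75}$$

By the definition (71) of $\beta^{*(1)}_{\varepsilon}$ we have

$$\beta^{*(1)}_{\varepsilon} \le D^{\varepsilon/2}_{\max}(\rho\|\sigma) + \log(2/\varepsilon).$$

■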
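The mechanism behind (72) to (74) is the operator inequality $\rho \le 2^{D_{\max}}\sigma$, which bounds $\mathrm{Tr}(\Pi\rho)$ by $2^{D_{\max}}\beta(\Pi)$ for every test. The sketch below illustrates this in a deliberately simplified setting: the states are assumed diagonal in a common basis, smoothing is ignored (so the unsmoothed quantity $\log_2\max_i \rho_i/\sigma_i$ is used in place of $D^{\varepsilon/2}_{\max}$), and only 0/1-valued diagonal tests are enumerated. The helper `d_max` and the numbers are hypothetical.

```python
import math
from itertools import product

def d_max(rho, sigma):
    """Unsmoothed max-relative entropy in bits for diagonal states.

    For commuting states this is log2 of the largest ratio rho_i / sigma_i,
    assuming sigma_i > 0 wherever rho_i > 0.
    """
    return math.log2(max(r / s for r, s in zip(rho, sigma) if r > 0))

# Hypothetical diagonal states (illustrative numbers only).
rho = [0.7, 0.2, 0.1]
sigma = [0.1, 0.3, 0.6]
dmax = d_max(rho, sigma)

# Check Tr(Pi rho) <= 2^{D_max} * beta(Pi) for every 0/1-valued diagonal test.
checks = []
for test in product([0, 1], repeat=len(rho)):
    tr_pi_rho = sum(t * r for t, r in zip(test, rho))
    beta = sum(t * s for t, s in zip(test, sigma))
    checks.append(tr_pi_rho <= 2 ** dmax * beta + 1e-12)

print(dmax, all(checks))
```

In particular, any test with $\beta(\Pi) \le 2^{-D_{\max}}\,\varepsilon/2$ has $\mathrm{Tr}(\Pi\rho) \le \varepsilon/2$, which is the unsmoothed analogue of (74).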

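We also obtain a lower bound on the strong converse rate $\beta^{*(1)}_{\varepsilon}$ for any $0 < \varepsilon < 1$ in terms of the smooth max-relative entropy as follows:

Lemma 22 For any $0 < \varepsilon < 1$, $\beta^{*(1)}_{\varepsilon} \ge D^{\sqrt{16\varepsilon}}_{\max}(\rho\|\sigma)$.

Proof. By Lemma 7, there exists an integer $\lambda \ge D^{\sqrt{16\varepsilon}}_{\max}(\rho\|\sigma)$ for which

$$\sqrt{8\,\mathrm{Tr}[\{\rho \ge 2^{\lambda}\sigma\}\rho]} = \sqrt{16\varepsilon}.$$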






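The above equation also implies the existence of a projector $\Pi := \{\rho \ge 2^{\lambda}\sigma\}$ such that

$$\mathrm{Tr}(\Pi\rho) \ge 2\varepsilon > \varepsilon.$$

Or equivalently,

$$\alpha(\Pi) < 1 - \varepsilon.$$

On the other hand, we have

$$\beta(\Pi) = \mathrm{Tr}(\Pi\sigma) \le 2^{-\lambda}.$$

The inequality follows from Lemma 2. This implies that $\beta^{*(1)}_{\varepsilon} \ge D^{\sqrt{16\varepsilon}}_{\max}(\rho\|\sigma)$ since

$$-\log\beta(\Pi) \ge \lambda \ge D^{\sqrt{16\varepsilon}}_{\max}(\rho\|\sigma).$$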

■ 

If instead one is given multiple identical copies of the quantum state, i.e., either $\rho^{\otimes n}$ or $\sigma^{\otimes n}$, then in the asymptotic setting ($n \to \infty$) our bounds on the one-shot strong converse rate converge to the quantum relative entropy of $\rho$ with respect to $\sigma$, and one retrieves the result first proved by Ogawa and Nagaoka [16].

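Definition 23 The (asymptotic) strong converse rate of the quantum hypothesis testing for the null hypothesis $H_0 : \rho^{\otimes n}$ versus the alternative hypothesis $H_1 : \sigma^{\otimes n}$ is defined in terms of the one-shot strong converse rate as follows:

$$\beta^*_\infty := \lim_{\varepsilon\to 0}\,\limsup_{n\to\infty}\,\frac{1}{n}\,\beta^{*(1)}_{\varepsilon},$$

with $\beta(\Pi)$ in (71) being evaluated as $\mathrm{Tr}[\Pi\sigma^{\otimes n}]$.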



The following theorem on the strong converse rate of asymptotic hypothesis testing is obtained directly from our one-shot results (Lemma 21 and Lemma 22).

Theorem 24 ([16]) The (asymptotic) strong converse rate for the states $\rho$ and $\sigma$ is given by

$$\beta^*_\infty = S(\rho\|\sigma),$$

where $S(\rho\|\sigma)$ denotes the quantum relative entropy of $\rho$ with respect to $\sigma$.

Proof. The proof follows directly from Lemmas 21 and 22 and (46). ■